1. Did Spotify users in the Netherlands change their music listening behavior during the COVID-19 pandemic?

The sound of COVID-19: Spotify usage in the Netherlands during a pandemic

The COVID-19 pandemic has shaken up society considerably. Many people are directly or indirectly affected by the health crisis or the resulting governmental measures. These measures, e.g. social distancing and isolation, have changed how people communicate, work and go about other aspects of daily life. This dashboard explores the following question:

Did Spotify users in the Netherlands change their music listening behavior during the COVID-19 pandemic?

A corpus has been created in order to perform various computational musicological analyses using the spotifyr and compmus packages.

The general listening behavior of Spotify users in the Netherlands before and during the pandemic will be explored, as measured by the Spotify API. In addition, specific events related to the pandemic (e.g. lockdown and curfew) will be considered to determine to what extent possible changes in listening behavior can be attributed to these events.

Corpus

In order to analyze general listening behavior, the most important variables for the portfolio are:


In order to keep track of the average listening behavior of Dutch Spotify users, the weekly ‘Top 50’ playlists from the Netherlands will be analyzed over time. The years 2019 (52 weeks) and 2020 (53 weeks) are measured in their entirety, and 2021 is measured until week 7.

2019 contains 52 playlists consisting of 50 tracks per playlist
2020 contains 53 playlists consisting of 50 tracks per playlist
2021 contains 7 playlists consisting of 50 tracks per playlist

This totals 5600 observations/tracks. As a track can be in the charts for multiple weeks, duplicates occur. The number of unique tracks within the corpus is 826.
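The corpus size follows directly from the playlist counts above. A minimal sketch of that arithmetic, where the variable names are illustrative and not taken from the portfolio's own code:

```python
# Weekly Top 50 playlists per year, as listed in the corpus description.
playlists_per_year = {2019: 52, 2020: 53, 2021: 7}
TRACKS_PER_PLAYLIST = 50
UNIQUE_TRACKS = 826  # number of unique tracks reported for the corpus

total_playlists = sum(playlists_per_year.values())
total_observations = total_playlists * TRACKS_PER_PLAYLIST

# Average number of weekly chart appearances per unique track.
average_appearances = total_observations / UNIQUE_TRACKS

print(total_playlists, total_observations)
```

The last quantity gives a rough idea of how long a typical track stays in the Top 50.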

Since Spotify automatically updates its playlists, the historical ‘Top 50’ lists will be retrieved as CSV files from Spotify Charts.

The changes in listening behavior (or lack thereof) will be measured by the different Spotify Audio Features:

• danceability

Danceability describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable.

• energy

Energy is a measure from 0.0 to 1.0 and represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy. For example, death metal has high energy, while a Bach prelude scores low on the scale. Perceptual features contributing to this attribute include dynamic range, perceived loudness, timbre, onset rate, and general entropy.

• key

The key the track is in. Integers map to pitches using standard Pitch Class notation, e.g. 0 = C, 1 = C♯/D♭, 2 = D, and so on.

• loudness

The overall loudness of a track in decibels (dB). Loudness values are averaged across the entire track and are useful for comparing relative loudness of tracks. Loudness is the quality of a sound that is the primary psychological correlate of physical strength (amplitude). Values typically range between -60 and 0 dB.

• mode

Mode indicates the modality (major or minor) of a track, the type of scale from which its melodic content is derived. Major is represented by 1 and minor is 0.
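As an illustration of how the key and mode integers combine into a readable key name, here is a small sketch. The function name `key_name` is hypothetical, and the handling of -1 assumes the Spotify convention of reporting -1 when no key was detected:

```python
# Pitch classes in standard Pitch Class notation: 0 = C, 1 = C♯/D♭, 2 = D, ...
PITCH_CLASSES = [
    "C", "C♯/D♭", "D", "D♯/E♭", "E", "F",
    "F♯/G♭", "G", "G♯/A♭", "A", "A♯/B♭", "B",
]


def key_name(key: int, mode: int) -> str:
    """Combine the key and mode features into a name such as 'G major'."""
    if key == -1:
        # Assumption: -1 marks a track for which no key was detected.
        return "unknown"
    if not 0 <= key <= 11:
        raise ValueError("key must be a pitch class between 0 and 11")
    quality = "major" if mode == 1 else "minor"
    return f"{PITCH_CLASSES[key]} {quality}"


print(key_name(7, 1))
print(key_name(0, 1))
```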

• speechiness

Speechiness detects the presence of spoken words in a track. The more exclusively speech-like the recording (e.g. talk show, audio book, poetry), the closer to 1.0 the attribute value. Values above 0.66 describe tracks that are probably made entirely of spoken words. Values between 0.33 and 0.66 describe tracks that may contain both music and speech, either in sections or layered, including such cases as rap music. Values below 0.33 most likely represent music and other non-speech-like tracks.
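The three speechiness ranges described above can be written as a simple rule. This is only a sketch: the function name and category labels are illustrative, and the treatment of values exactly at 0.33 and 0.66 is an assumption, since the description does not say which side the boundaries belong to.

```python
def speechiness_category(speechiness: float) -> str:
    """Map a speechiness value to one of the three described ranges."""
    if speechiness > 0.66:
        return "spoken word"       # probably made entirely of spoken words
    if speechiness >= 0.33:
        return "music and speech"  # e.g. rap, or speech layered over music
    return "music"                 # most likely music or other non-speech audio


print(speechiness_category(0.05))
```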

• acousticness

A confidence measure from 0.0 to 1.0 of whether the track is acoustic. 1.0 represents high confidence the track is acoustic.

• instrumentalness

Predicts whether a track contains no vocals. “Ooh” and “aah” sounds are treated as instrumental in this context. Rap or spoken word tracks are clearly “vocal”. The closer the instrumentalness value is to 1.0, the greater likelihood the track contains no vocal content. Values above 0.5 are intended to represent instrumental tracks, but confidence is higher as the value approaches 1.0.

• liveness

Detects the presence of an audience in the recording. Higher liveness values represent an increased probability that the track was performed live. A value above 0.8 provides strong likelihood that the track is live.

• tempo

The overall estimated tempo of a track in beats per minute (BPM). In musical terminology, tempo is the speed or pace of a given piece and derives directly from the average beat duration.

• valence

A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry).

• duration_ms

The duration of the track in milliseconds.

The following variables obtained through the Spotify API will also be included:

  • Number of streams

  • Position

  • Track Name

  • Artist

  • Streams

The variable time will be used to identify the different weeks as well as the periods before and during the pandemic that may explain the changes in music listening behavior from the top and viral playlists.

In addition, interesting annual periods will be isolated to see if similar patterns reoccur during the pandemic. For example, the December holiday season before and during the pandemic will be analyzed to identify whether Spotify users altered their Christmas-related listening behavior.

  • Week
  • Year

The corpus runs from week 1, 2019 until week 7, 2021, and the data is split into two periods: Before the pandemic and During the pandemic. This makes it easier to attribute analyses to these periods, rather than to individual years or weeks.
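A minimal sketch of how a week could be assigned to one of the two periods. The cutoff of week 11, 2020 is a hypothetical placeholder, because the exact boundary between the periods is not stated here; the function name is also illustrative.

```python
# Hypothetical cutoff (year, week): the first week counted as "During".
# The actual boundary used in the dashboard is not given in the text.
PANDEMIC_START = (2020, 11)


def period(year: int, week: int, cutoff: tuple = PANDEMIC_START) -> str:
    """Label a (year, week) pair as 'Before' or 'During' the pandemic."""
    # Tuples compare element by element, so (year, week) orders chronologically.
    return "During" if (year, week) >= cutoff else "Before"


print(period(2019, 52), period(2021, 7))
```

Keeping the cutoff as a parameter makes it easy to test how sensitive an analysis is to the chosen boundary.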

Alongside the musical analyses, statistics concerning COVID-19 will be taken into account as well. The data used is provided by the Dutch National Institute for Public Health and the Environment (RIVM). The data has been pre-processed to include both weekly and cumulative data. The following variables are included in this dashboard:

  • Number Hospital Admissions

  • Number of Deaths

  • Reported cases of COVID-19
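To illustrate the relation between the weekly and cumulative figures, the sketch below derives a running total from weekly counts. The numbers are placeholder values chosen for the example and are not RIVM data; the function name is illustrative.

```python
from itertools import accumulate


def cumulative_totals(weekly_counts):
    """Turn a list of weekly counts into a list of running totals."""
    return list(accumulate(weekly_counts))


# Placeholder weekly counts, NOT actual RIVM figures.
weekly_cases = [10, 25, 40]
print(cumulative_totals(weekly_cases))
```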

3. Trip down memory lane:
Comparing pre-pandemic to intra-pandemic listening behavior



In this frame you can make week-by-week comparisons of different variables for the periods before and during the pandemic.

The interesting variables for comparing the different periods are valence and energy, as these reflect the valence/arousal model with the emotions Happy, Angry, Sad and Relaxed.

The corpus is spread fairly evenly, and both before and during the pandemic most of the Top 50 tracks are in the Happy quadrant. This is not very surprising, as the Top 50 usually consists mostly of pop songs.

In the weekly plots it is observed that the average valence is much higher during the pandemic, and that the average weekly streams follow a similar trend, with some anomalies. These will be explored further down the line.
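The quadrants of the valence/arousal model can be sketched as a simple rule on valence and energy. Two assumptions are made here: energy is used as the arousal axis, and the midpoint of 0.5 separates "high" from "low" on both axes. The function name is illustrative.

```python
def emotion_quadrant(valence: float, energy: float, midpoint: float = 0.5) -> str:
    """Assign a track to a quadrant of the valence/arousal model."""
    high_valence = valence >= midpoint
    high_energy = energy >= midpoint
    if high_valence and high_energy:
        return "Happy"    # positive and energetic
    if high_energy:
        return "Angry"    # negative and energetic
    if high_valence:
        return "Relaxed"  # positive and calm
    return "Sad"          # negative and calm


print(emotion_quadrant(0.8, 0.7))
```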

5. 17 Miljoen Mensen vs. 15 Miljoen Mensen - The prominent cover song during the pandemic shows little similarity with original


DTW and Chromagrams

The track “17 Miljoen Mensen” (2020) is a cover of “15 Miljoen Mensen” (1996). An analysis of the chroma features of the two tracks aims to find similarities between them. Notice the title adjustment in “17 Miljoen Mensen” for the population increase of 2 million people, and the track's short duration of just 1 minute and 47 seconds. But what are other differences or similarities?

The first plot shows the Dynamic Time Warping plot of the two tracks, using Euclidean norm and angular distance. A diagonal pattern would denote similarity between the two tracks. This is not observed, which implies significant differences between the two tracks. This is supported by the table, which shows that the pitch classes differ. According to the Spotify API, “17 Miljoen Mensen” is in the key of G major, whereas “15 Miljoen Mensen” is in the key of C major. This is not explicitly shown, but the keys are represented in their respective chromagrams.
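For readers unfamiliar with Dynamic Time Warping, the following is a minimal pure-Python sketch of the idea, not the compmus implementation used for the plot. The angular distance is scaled to the range 0 to 1 by dividing the angle by π, and silent (all-zero) frames are treated as maximally distant; both choices are assumptions made for this example.

```python
import math


def angular_distance(u, v):
    """Angle between two vectors, scaled to [0, 1] (assumed scaling)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # assumption: a silent frame is maximally distant
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine) / math.pi


def dtw_cost(seq_a, seq_b, dist=angular_distance):
    """Total cost of the cheapest alignment between two feature sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # acc[i][j] holds the cheapest cost of aligning the first i and j frames.
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(seq_a[i - 1], seq_b[j - 1])
            acc[i][j] = step + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


# Toy two-bin "chroma" sequences: the second is a time-stretched copy of the first.
original = [[1, 0], [0, 1]]
stretched = [[1, 0], [1, 0], [0, 1]]
print(dtw_cost(original, stretched))
```

Two sequences that differ only in timing align at (near) zero cost, which is what a clear diagonal in a DTW plot indicates.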

In addition, the ‘sound and feel’ of the tracks differ: “15 Miljoen Mensen” has a higher danceability, energy, and loudness, whereas “17 Miljoen Mensen” has a much higher acousticness and a somewhat higher liveness (due to the recording being a live performance).

A remarkable commonality probably explains the differences: both tracks were unintended single releases. “15 Miljoen Mensen” was initially written for a commercial, and “17 Miljoen Mensen” as a tribute for a music concert that was canceled due to COVID-19. The different motivations behind the tracks are reflected in the different ‘sound and feel’ shown by the Spotify API.

For a commercial you would want a more catchy/upbeat track, contrary to a song related to a disaster or crisis. This explains the difference in loudness: “17 Miljoen Mensen” has a loudness of -10.041 dB, whereas “15 Miljoen Mensen” has a loudness of -7.063 dB.

Spotify Features Table

                   17 Miljoen Mensen (2020)   15 Miljoen Mensen (1996)
danceability       0.493                      0.547
energy             0.321                      0.631
key                7                          0
loudness           -10.041                    -7.063
mode               1                          1
speechiness        0.0402                     0.0266
acousticness       0.715                      0.0943
instrumentalness   0                          0
liveness           0.0863                     0.0548
valence            0.508                      0.481
tempo              86.77                      79.02
duration_sec       107.2                      236.107
time_signature     4                          4


6. All I Want before Christmas… is Christmas
Earlier Christmas in 2020 due to the lockdown.


Christmas songs started to dominate the charts in 2020 from around week 49 until week 53, whereas in 2019 this phenomenon occurred a bit later. In 2020 it is noticeable that the bottom right corner of the plot contains tracks with relatively high BPM, high valence, lower energy and lower danceability. During these weeks Christmas tracks dominated the charts. In 2019 this phenomenon is very noticeable in week 52, although Christmas slowly started in week 50. Also, in 2020 the charts remained similar during the holiday period from week 50 to 53, whereas in 2019 week 52 saw a spike in the Christmas-related audio features. This pattern implies that more Christmas tracks entered the Top 50.

Interestingly, Mariah Carey’s ‘All I Want for Christmas Is You’ topped the charts for four consecutive weeks in 2020, as opposed to one week in 2019.

A possible explanation is that due to the imposed lockdown and other restrictions, people may have felt a need or desire for the “Christmas Spirit/Vibes” a week earlier than in 2019.

Another interesting discovery is that the top streams in 2020 decreased in a similar fashion to 2019. A possible explanation is that people disregarded the lockdown regulations and spent the holiday season with friends and/or family, or were preoccupied with other activities to keep in touch with them.

7. Self-Similarity Matrix: “Dance Monkey” Shows repeating pattern and noticeably distinct Millennial Whoop


Dance Monkey

“Dance Monkey” by Tones And I is one of the most popular tracks within the corpus. A structure analysis will show possible patterns of sequences within the track and their relation.

Cepstrogram

The first cepstrogram plot shows the magnitude of each timbre feature per segment of the track. The feature c01 is loudness, c02 is low frequencies (darkness) and c03 is mid frequencies. c04 and up are not defined as straightforwardly, but their meaning may be inferred by keeping track of changes during specific segments of the track. The cepstrogram shows that the timbre of “Dance Monkey” is mostly defined by c01 to c05.

  • c01 Loudness: The segments reflect the loudness of the track. This is especially noticeable during the final chorus.
  • c02 Darkness: The segments faintly show a higher magnitude when the bass drum hits. Its omission is much clearer during the breakdown starting at 150 seconds.
  • c03 Mid frequency: It’s shown at about 50 seconds and 165 seconds when higher notes are less and more distinct respectively.
  • c04 Attack: This is very prevalent during the intro (vocal stretch fade-in sfx).
  • c05 [Unknown]: It has the highest magnitude at around 150 seconds, where the loudness of the “Millennial Whoop” is noticeable.

Self Similarity

The second and third plots are Self-Similarity Matrices (SSM), the first being pitch and the second timbre. These plots show the structure of a track by denoting patterns of similarities that reoccur. Diagonal lines and a checkerboard pattern show similarity and repetition.

The timbre SSM is plotted using Euclidean norm, Euclidean distance and summarized by the mean. The plot shows a faint checkerboard pattern, which implies some form of repetition in the track. At the 150 second mark there is a significant timbre difference. This is when the breakdown occurs with the earlier mentioned “Millennial Whoop”.

The pitch SSM is plotted using Euclidean norm, cosine distance and summarized by root mean square. This plot shows a slightly more noticeable checkerboard pattern. At the 150 second mark, the plot shows a significant change.
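To make the construction of a self-similarity matrix concrete, here is a minimal pure-Python sketch using Euclidean normalisation and cosine distance, as for the pitch SSM. It is an illustration under simplified assumptions (one feature vector per segment, no summarisation step), not the compmus code behind the plots; all function names are illustrative.

```python
import math


def euclidean_normalise(vector):
    """Scale a vector to unit Euclidean length (zero vectors stay unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def cosine_distance(u, v):
    """Cosine distance between two unit-length vectors."""
    return 1.0 - sum(a * b for a, b in zip(u, v))


def self_similarity(frames):
    """Pairwise distances between all segments of one track."""
    unit = [euclidean_normalise(frame) for frame in frames]
    return [[cosine_distance(u, v) for v in unit] for u in unit]


# Toy track with an A-B-A-B structure, using two-bin feature vectors.
frames = [[2, 0], [0, 3], [2, 0], [0, 3]]
for row in self_similarity(frames):
    print(row)
```

Repeated sections show up as low distances away from the main diagonal, which produces the checkerboard and diagonal patterns described above.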



8. In the mood for which keys?
Chord and Key estimations for “Mood”.


The track “Mood” by 24kGoldn ft. iann dior is also one of the identified popular tracks in the corpus. A keygram and chordogram are plotted in order to show the tonal progression of the track by estimating the chords and key for each segment.

The keygram shows that E♭ major, G minor, F major, C major, G major, and C♯ minor are the prevalent keys during the track. The chordogram shows that C minor, E♭ 7, and E♭ major are the most prevalent chords of the track.

Spotify API

According to the Spotify API, this track is written in key 7 with mode 0, meaning G minor.

Chordify

The Chordify algorithm identified the chords within the following (4/4) loop:

E♭ - Gm - | B♭ - F - |

The identified key appears to follow the G natural minor scale:

G - A - B♭ - C - D - E♭ - F
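As a quick check of this reading, the sketch below builds the natural minor scale on G from its whole and half steps and verifies that every tone of the four chords in the loop lies within that scale. The helper names are illustrative, and flat spellings are used throughout for readability.

```python
FLAT_NAMES = ["C", "D♭", "D", "E♭", "E", "F", "G♭", "G", "A♭", "A", "B♭", "B"]

# Semitone steps between successive degrees of a natural minor scale.
NATURAL_MINOR_STEPS = [2, 1, 2, 2, 1, 2]


def natural_minor_scale(tonic: int):
    """Pitch classes of the natural minor scale starting on the given tonic."""
    pitch_classes = [tonic % 12]
    for step in NATURAL_MINOR_STEPS:
        pitch_classes.append((pitch_classes[-1] + step) % 12)
    return pitch_classes


def triad(root: int, quality: str):
    """Pitch classes of a major or minor triad."""
    third = 4 if quality == "major" else 3
    return [root % 12, (root + third) % 12, (root + 7) % 12]


g_minor = natural_minor_scale(7)
print(" - ".join(FLAT_NAMES[pc] for pc in g_minor))

# Chords of the loop: E♭, Gm, B♭, F.
loop = [(3, "major"), (7, "minor"), (10, "major"), (5, "major")]
print(all(set(triad(root, quality)) <= set(g_minor) for root, quality in loop))
```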



9. Histogram of Keys within the corpus shows C♯ as the most common key.


While the histogram doesn’t show a clear or unanimous preference, the keys C♯, F♯ and G♯ consistently have a relatively high count within the corpus.

In 2019, there is a clearly higher count of the keys C♯, F, G and B.

In 2020, the keys C♯, F♯, G♯ and B have a noticeably higher frequency in the corpus.

Note that 2021 only contains the first 7 weeks, whereas 2019 and 2020 contain 52 and 53 weeks respectively. Therefore, 2021 is not very representative and comparisons with the other years should be made with caution.
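One way to make the years more comparable despite their different sizes is to look at the share of each key within a year instead of the raw count. A minimal sketch, where the observations are placeholder values rather than corpus data and the function name is illustrative:

```python
from collections import Counter


def key_shares(observations):
    """Relative frequency of each key per year from (year, key) pairs."""
    counts_per_year = {}
    for year, key in observations:
        counts_per_year.setdefault(year, Counter())[key] += 1
    shares = {}
    for year, counts in counts_per_year.items():
        total = sum(counts.values())
        shares[year] = {key: n / total for key, n in counts.items()}
    return shares


# Placeholder observations (year, key), NOT taken from the corpus.
observations = [(2019, 1), (2019, 1), (2019, 6), (2019, 8), (2021, 1), (2021, 6)]
print(key_shares(observations))
```

Shares still become noisier when a year has few observations, so the caution about 2021 remains.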

10. Most prevalent beats:
Tempi around 100 BPM and 120 BPM most common in corpus.


The density plot shows that overall the most frequent tempi within the corpus are around 90-100 BPM and 115-128 BPM. The year 2019 showed a strong preference for tracks around 98 BPM and to a lesser extent 123 BPM.

The year 2020 showed a strong preference for tracks around both 95 BPM and 121 BPM.

The first 7 weeks of 2021 showed a preference for tracks around 99 BPM and 122 BPM.


Average Tempo:

  • 117 BPM
  • 116 BPM
  • 121 BPM

11. “Tigers” by Bilal Wahib has a tempo of 112 BPM


Another popular track in the corpus is “Tigers” by Bilal Wahib. Tempograms are plotted in order to show the estimated BPM of the track along its duration.

The tempo feature of the Spotify API estimates a BPM of 111.943 (rounded to 112 BPM).

The first tempogram doesn’t explicitly reflect the estimation of the Spotify API: tempi of around 210-220 and 430-450 are shown in the plot. The plot might represent the double-time and quadruple-time tempi of 224 and 448 BPM (based on the estimation of 112 BPM).

The second tempogram (cyclic) is adjusted to represent the more ‘common’ tempi at which humans tap. This plot does reflect the Spotify API estimation of 112 BPM more clearly.

At the 75 second mark, there is a slight drop and increase in tempo. At this point the tape stop sound effect is noticeable, which is immediately followed by the bridge. The BPM, however, remains the same (try to tap along).
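The relation between these tempi can be sketched by folding any tempo into a single octave through repeated doubling or halving. The range of 80 to 160 BPM is an assumption chosen for this example, and the function name is illustrative.

```python
def fold_tempo(bpm: float, low: float = 80.0, high: float = 160.0) -> float:
    """Fold a tempo into the range [low, high) by doubling or halving it."""
    if bpm <= 0:
        raise ValueError("tempo must be positive")
    while bpm >= high:
        bpm /= 2  # e.g. 448 -> 224 -> 112
    while bpm < low:
        bpm *= 2  # e.g. 56 -> 112
    return bpm


for tempo in (224, 448, 56):
    print(tempo, "->", fold_tempo(tempo))
```

Tempi that are a factor of two apart end up at the same folded value, which is why a cyclic tempogram shows a single clear line where the regular tempogram shows several.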



12 X


The corpus has a large variety of audio features that are similar on average, and is solely based on popular tracks. Combined with the timeframe of roughly one year, this makes predicting whether a track belongs before or during the pandemic quite unfeasible. The music between 2019 and 2021 is too similar to distinguish a pattern of tracks topping the chart in the two time periods. Comparing different musical “eras” (e.g. 70’s, 80’s and 90’s) would render better results.


Conclusion
